Convex minorants of random walks and Lévy processes
This article provides an overview of recent work on descriptions and
properties of the convex minorant of random walks and Lévy processes, which
summarizes and extends the literature on these subjects.
The results surveyed include point process descriptions of the convex
minorant of random walks and Lévy processes on a fixed finite interval, up to
an independent exponential time, and in the infinite horizon case. These
descriptions follow from the invariance of these processes under an adequate
path transformation. In the case of Brownian motion, we note how further
special properties of this process, including time-inversion, imply a
sequential description for the convex minorant of the Brownian meander.
Comment: 11 pages, 5 figures
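As a concrete illustration of the object this abstract discusses, the convex minorant of a random walk on a fixed finite interval is the greatest convex function lying below the points (k, S_k). The sketch below is not taken from the article: it uses a standard lower-convex-hull scan, and the function names `convex_minorant` and `faces` are hypothetical. It returns the vertices of the minorant and the length and increment of each face, which are the quantities a point process description would record.

```python
import random


def convex_minorant(path):
    """Return the indices of the vertices of the convex minorant of the
    points (k, path[k]), k = 0, ..., len(path) - 1 (lower convex hull)."""
    hull = []
    for k, s in enumerate(path):
        # Drop the last vertex while it lies on or above the chord
        # joining the vertex before it to the new point (k, s).
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            cross = (j - i) * (s - path[i]) - (path[j] - path[i]) * (k - i)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(k)
    return hull


def faces(path):
    """Return (length, increment) for each face of the convex minorant."""
    v = convex_minorant(path)
    return [(b - a, path[b] - path[a]) for a, b in zip(v, v[1:])]


if __name__ == "__main__":
    # Illustrative simple random walk with +1/-1 steps (an assumption,
    # chosen only to have something to run the scan on).
    rng = random.Random(0)
    walk = [0]
    for _ in range(20):
        walk.append(walk[-1] + rng.choice([-1, 1]))
    print(convex_minorant(walk))
    print(faces(walk))
```

The slopes increment/length of successive faces are strictly increasing, which is what makes the piecewise-linear interpolation through the returned vertices convex.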
Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback
An important goal in artificial intelligence is to create agents that can
both interact naturally with humans and learn from their feedback. Here we
demonstrate how to use reinforcement learning from human feedback (RLHF) to
improve upon simulated, embodied agents trained to a base level of competency
with imitation learning. First, we collected data of humans interacting with
agents in a simulated 3D world. We then asked annotators to record moments
where they believed that agents either progressed toward or regressed from
their human-instructed goal. Using this annotation data we leveraged a novel
method - which we call "Inter-temporal Bradley-Terry" (IBT) modelling - to
build a reward model that captures human judgments. Agents trained to optimise
rewards delivered from IBT reward models improved with respect to all of our
metrics, including subsequent human judgment during live interactions with
agents. Altogether our results demonstrate how one can successfully leverage
human judgments to improve agent behaviour, allowing us to use reinforcement
learning in complex, embodied domains without programmatic reward functions.
Videos of agent behaviour may be found at https://youtu.be/v_Z9F2_eKk4
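For orientation, the sketch below shows the standard Bradley-Terry preference model that the name "Inter-temporal Bradley-Terry" refers to. It is not the authors' IBT method, whose details are not given in this abstract. The function names and the scalar rewards are assumptions made for illustration only.

```python
import math


def bt_preference_prob(r_a, r_b):
    """Standard Bradley-Terry probability that item a is preferred to
    item b, given scalar scores (here: hypothetical reward values)."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))


def bt_loss(r_preferred, r_other):
    """Negative log-likelihood of one observed preference; minimising it
    pushes the preferred item's score above the other's."""
    return -math.log(bt_preference_prob(r_preferred, r_other))


if __name__ == "__main__":
    print(bt_preference_prob(1.0, 0.0))
    print(bt_loss(1.0, 0.0))
```

In a reward-modelling setting the two scores would typically come from a learned model evaluated on two moments or trajectories, and the loss would be averaged over annotated comparisons; how the authors construct those comparisons from their annotations is not specified here.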
Intra-agent speech permits zero-shot task acquisition
Human language learners are exposed to a trickle of informative,
context-sensitive language, but a flood of raw sensory data. Through both
social language use and internal processes of rehearsal and practice, language
learners are able to build high-level, semantic representations that explain
their perceptions. Here, we take inspiration from such processes of "inner
speech" in humans (Vygotsky, 1934) to better understand the role of intra-agent
speech in embodied behavior. First, we formally pose intra-agent speech as a
semi-supervised problem and develop two algorithms that enable visually
grounded captioning with little labeled language data. We then experimentally
compute scaling curves over different amounts of labeled data and compare the
data efficiency against a supervised learning baseline. Finally, we incorporate
intra-agent speech into an embodied, mobile manipulator agent operating in a 3D
virtual world, and show that with as few as 150 additional image captions,
intra-agent speech endows the agent with the ability to manipulate and answer
questions about a new object without any related task-directed experience
(zero-shot). Taken together, our experiments suggest that modelling intra-agent
speech is effective in enabling embodied agents to learn new tasks efficiently
and without direct interaction experience.
NATIONAL INSTITUTIONAL FRAMEWORKS AND THE HYBRIDIZATION OF ENTREPRENEURIAL BUSINESS MODELS: THE GERMAN AND UK BIOTECHNOLOGY SECTORS
Given what institutional scholars have described as an inhospitable institutional climate for entrepreneurial business, why has the German biotechnology industry suddenly taken off, while in the UK, where a "correct" institutional architecture exists, the industry has shown signs of stagnation? To explain these trends, the article develops a firm-centered approach, recognizing that firms work with institutional frameworks, often with help from public policies, to create new business strategies. The article argues that such processes are associated with the "hybridization" of business strategies at the micro level, combined with the generation of new constellations of particular institutional frameworks within relatively stable national models.